A fast multigrid-based electromagnetic eigensolver for 
curved metal boundaries on the Yee mesh 1 ^ 



a.b 



Carl A. Bauer a '*, Gregory R. Werner a , John R. Cary 

a Department of Physics and the Center for Integrated Plasma Studies, University of 
Colorado, Boulder, Colorado 80309 
b Tech-X Corporation, Boulder, Colorado 80303 



Abstract 

For embedded boundary electromagnetics using the Dey-Mittra Q al- 
gorithm, a special grad-div matrix constructed in this work allows use 
of multigrid methods for efficient inversion of Maxwell's curl-curl ma- 
trix. Efficient curl-curl inversions are demonstrated within a shift-and- 
invert Krylov-subspace eigensolver (open-sourced at https : / / github . com/ 
bauerca/maxwell) on the spherical cavity and the 9-cell TESLA supercon- 
ducting accelerator cavity. The accuracy of the Dey-Mittra algorithm is also 
examined: frequencies converge with second-order error, and surface fields 
are found to converge with nearly second-order error. In agreement with 
previous work neglecting some boundary-cut cell faces (as is required 
in the time domain for numerical stability) reduces frequency convergence 
to first-order and surface-field convergence to zeroth-order (i.e. surface fields 
do not converge). Additionally and importantly, neglecting faces can reduce 
accuracy by an order of magnitude at low resolutions. 

Keywords: electromagnetics, finite difference, Yee, Dey, Mittra, algorithm, 

eigensolver, Maxwell, accelerator, multigrid, cavity 

2000 MSC: 78-04, 78M20, 65F15, 65N22, 65N55, 65N25, 65N06, 65Z05 



This work was supported by the U. S. Department of Energy grant DE-FG02- 
04ER41317. 

* Corresponding author: carl.bauer@colorado.edu 
Email address: carl.bauer@colorado.edu (Carl A. Bauer) 



Preprint submitted to Elsevier 



January 17, 2013 



1. Introduction 



The Dey-Mittra electromagnetics algorithm simulates smooth curvedper- 
fectly-conducting boundaries using the Yee finite-difference technique [!, [if. 
The algorithm is often called a cut-cell or embedded-boundary technique 
since the mesh does not conform to the geometry of the conducting boundary 
(grid cells, faces, and edges are "cut" by boundaries). In the time-domain, 
the Courant-Friedrichs-Lewy (CFL) condition reduces the accuracy of the 
Dey-Mittra algorithm by requiring the neglecting of some cut faces. More 
precisely, the CFL condition states that the maximum stable timestep is 
limited by the maximum eigenvalue (of the discretized curl-curl matrix) and, 
in the Dey-Mittra algorithm, the maximum eigenvalue can be inflated greatly 
by faces barely cut by a boundary. A trade-off between accuracy and wall- 
clock simulation time ensues; if fewer neglected faces are desired (greater 
accuracy) , the time-step must be reduced [H, 0] . In this paper, we consider 
the Dey-Mittra algorithm in the frequency-domain, where the CFL condition 
does not apply and the full accuracy of the method can be used. 

We begin by reviewing the two important aspects of the problem: (1) 
the Dey-Mittra algorithm and (2) eigensolving Maxwell's equations as dis- 
cretized on the Yee mesh (ultimately, this leads to the question: how 
does one invert the curl-curl operator? Fortunately, this is well-studied 

b a a a a 

[si). The advance of this paper is described in Sectional and 
amounts to a transformation of the discretized Dey-Mittra curl-curl operator 



that allows efficient inversion by multigrid techniques [10[. Proof of perfor- 
mance is given in the numerical results, where our eigensolver attacks the 
spherical resonant cavity and the 9-cell TESLA superconducting accelerator 
cavity. The code used throughout this paper is open-sourced, and can be 
found at jhttps : //github. com/bauerca/maxwell. 

2. The Dey-Mittra algorithm 

Electromagnetic cavity eigenmodes are solutions to Maxwell's wave equa- 
tion subject to perfectly conducting boundary conditions; a magnetic eigen- 
mode satisfies 

VxVxB = jfe 2 B infl (1) 
n • B = on dfl (2) 
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where f2 is the cavity interior, <9f2 is the perfectly conducting boundary, n is 
the normal to the boundary, and k = u/c, where u is the resonant angular 
frequency and c is the speed of light. We discretize Maxwell's equations with 
the finite-difference Yee algorithm |3( , labeling the grid electric and magnetic 
field components as e a \ijk and b a \ijk, respectively, where a is one of x, y, or 
z and i, j, and k are integer grid cell indices. Figure [T] shows the spatially 
staggered component layout of the Yee scheme which ensures the first-order 
accuracy (second-order error) of the discretized curl operators. In matrix- 
vector form, where b (e) is the vector of all b a \ijk (en\iik) components, the 



discretized version of Eq. ([I]) in vacuum is written [111 . Il2 



CC T b = k 2 b. 



(3) 



The Yee layout guarantees that the curl of the electric field is the transpose 
of the curl of the magnetic field, resulting in the symmetric matrix of Eq. [3] 
(the curl-curl matrix is also positive semi-definite, i.e. k 2 > 0). 
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Figure 1: Color online. Yee grid cellijk. Electric (Magnetic) field components are centered 
on edges (faces) . 



The Dey-Mittra algorithm is a modification of the Yee algorithm which 
simulates curved perfectly conducting boundaries in 3D with second-order 
error [l], 0] • The algorithm is based on the finite integral interpretation of 
the Yee algorithm 13|, |l4| where, for example, the Yee Faraday update for 
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b x \ijk (in the frequency domain) is written as 



— iuL 



x\ijk 



^x\ijk 



(j"y\ijk&y\ijk ly\ijk+l&y\ijk+l I z\ij+lk&z\ij+lk lz\ijk&z\ijk) j 

(4) 

which is a representation of Faraday's Law in integral form: — iu J B ■ da = 
<f ~E ■ dl. In the above, l a \ijk is the length of the edge of the Yee grid cell on 
which the component, e a ujk, is centered (see Fig. [Q. Similarly, a a \ijk is the 
area of the cell face on which the component, b a \ijk, is centered. In vacuum, 
Ay, and a x \^ k = AyAz such that Eq. (0J reduces to the 



x\ijk 



Ax, ly\ijk 



usual Yee finite difference expression. 
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Figure 2: An example of a cut face. The shaded region is conductor. In the Dey-Mittra 
method, the loop integral for Faraday's update is performed along the arrow-indicated 
contour. Because tangential electric fields are zero at perfectly-conducting boundaries, 
the line integral along the surface is skipped. 

When a face, a a \ijk is intersected by a perfectly conducting boundary, the 
Dey-Mittra algorithm takes l a Ujk an d a a \ijk to be the portion of the length 
and area, respectively, outside the conductor (see Fig. [2]). This is a physically 
meaningful representation of Faraday's Law in integral form since the electric 
field tangent to conducting boundaries vanishes. In matrix-vector form, the 
Dey-Mittra algorithm changes Eq. ()3]) to 

A^CLC^b = k 2 h (5) 

where A is a diagonal matrix of cell face fractions (e.g., Ar x \ijk)(x\ijk) = 
a x \ijk/ AyAz) and L is a diagonal matrix of cell edge fractions (e.g., 

~^- , (x\ijk){x\ijk) 1>x\ijk/ Ax) . 
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Vanishing area fractions lead to very large elements in the inverse area 
fraction matrix A -1 which can then inflate the maximum eigenvalue of the 
Dey-Mittra curl-curl matrix (relative to the vacuum Yee algorithm). For this 
reason, as summarized in the next section, the Dey-Mittra algorithm cannot 
be used to its full potential in the time-domain. Following the next section, 
the rest of this paper considers the frequency-domain, where this restriction 
is no longer an issue. 



3. Dey-Mittra: weakened for the time-domain 

To use Dey-Mittra in the time-domain, one must neglect certain cut faces 
with small area fractions, thus reducing the accuracy of the algorithm. In 
the standard Yee leap-frog time-stepping scheme, the Courant-Friedrichs- 
Lewy (CFL) condition states that the maximum eigenvalue of Eq. limits 
the maximum stable timestep. In 3D vacuum, the CFL condition on the 
timestep At is 

At < At CFL = 1 i ==. (6) 

In the Dey-Mittra algorithm, small area fractions amplify elements of the 
matrix A -1 . For some cuts, the edge lengths in the numerator of Eq. [4] do 
not compensate, and the largest eigenvalue of the Dey-Mittra wave equation 
(Eq. [5]) can be larger than that of the vacuum Yee wave equation (Eq. [3]). 
Therefore, the Dey-Mittra time-domain algorithm can require a much re- 
duced timestep as compared to At CFL [if. 

Alternatives to and modifications of the Dey-Mittra algorithm have been 
suggested that try to avoid the unfavorable CFL condition. The algorithm of 



15l | is complicated, but retains the second-order convergence of Dey-Mittra 
without reduction in timestep (compared to AtcFiJ by expanding the stencil 
for components with small area fractions (effectively averaging/enlarging the 
curl of E update for that component). Another simple modification of Dey- 
Mittra retains the small boundary stencil and the vacuum CFL timestep, but 
at the expense of second-order convergence [16( ; however, it achieves low ab- 
solute frequency errors in the numerical tests. A different approach corrects 



for the errors induced by stairstepping the boundary [17|, |l8j; however, to 
ensure energy conservation (and thus, long-time numerical stability), linear 
solves are required to find the operator coefficients on the boundary. This 
has been demonstrated only in 2D [181 ]. 
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With Dey-Mittra, the usual approach to avoid prohibitively small time- 
steps is to neglect faces with small area fractions, resulting in a pointy pertur- 
bation of the original conducting boundary. Unfortunately, neglected faces 
reduce the frequency convergence of the Dey-Mittra algorithm to first-order 
[2| . The alternatives mentioned in the previous paragraph perform well com- 
pared to the Dey-Mittra algorithm with neglected faces. 

Our new eigensolver allows the keeping of all cut faces without degrada- 
tion of the solution time (that is, given a simulation and resolution, solution 
times are the same whether faces are neglected or not). Given this ease, the 
effects of neglecting faces are probed in more detail in this paper as com- 
pared to the results of Ref. 0|. Namely, in addition to confirming the first- 
order frequency convergence induced by neglecting faces, we also highlight 
the stagnation of field convergence for fields on the surface of the conducting 
boundary. 

The technique used to neglect small face fractions in Ref. j2j analyzes the 
Gershgorin Circle for every boundary face to minimize the number ignored. 
For simplicity, we have implemented a less sophisticated face-neglecting tech- 
nique. We define a minimum area fraction, a m j n ; magnetic field components 
associated with area fractions less than a min are set to zero (i.e. the face is 
neglected). Electric field components on edges of the neglected face are also 
set to zero. 



4. Eigensolving the Yee vacuum wave equation 

Simulations of interest (especially in accelerator component design) rou- 
tinely require millions of grid cells to achieve the desired accuracy. The curl- 
curl matrix in Eq. [3] or [5] therefore has millions of rows and cannot be fully 
diagonalized. Iterative eigensolvers that search for only a small subset of the 
solutions are necessary. As a simplified preamble to our Dey-Mittra eigen- 
solving methods, we review in this section the currently preferred technique 
for eigensolving the discretized Maxwell's equations in vacuum (c.f. Eq. [3]). 
From there, only minor modifications to the technique will be required to 
explain our method. 

4-1. Krylov-subspace shift and invert eigensolvers 

Krylov-subspace iterative eigensolvers are widely-used and discussed in 



detail by |19|, |20|. In essence, the technique builds and refines a subspace in 



which approximations to true eigenpairs lie. The Krylov subspace is formed 
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by repeatedly applying the eigensystem matrix to an initial (usually random) 
vector. 

In Krylov-subspace methods, convergence to an eigenpair with eigenvalue 
A; is faster when Aj is at one extreme of the spectrum and the relative separa- 
tion of neighboring eigenvalues is large, i.e. |Aj — Aj + i|/|A max — A m i n | j2l|. The 
eigenmodes of interest (especially in accelerator component design) are often 
the lowest-frequency resonant modes; therefore, the usual scheme is to shift 
and invert the operator so that the eigenvalues of interest are at the top of 
the spectrum and are well-separated. Algebraically, for a generic eigensystem 
Hx = Ax, one solves the problem 

(H - rf)- l x = - J-x (7) 
A — a 

where I is the identity and a is some shift on the order of the smallest 
nonzero eigenvalue of H. An eigenvector of Hx = Ax is also an eigenvector 
of Eq. [71 The shift is used primarily to avoid nullspaces of H (which render 
it uninvertible). Theoretically, the shift could also accelerate convergence to 
any eigenpair by choosing o w \ if eigenpair Aj,Xj is desired. However, in 
practice, the inversion of H — o\ often (and quickly) becomes intractable as 
a increases toward the interior of the spectrum. 

4-2. Inverting the curl- curl matrix 

Building the Krylov subspace for the shifted and inverted system involves 
the repeated application of (H — erl) -1 . Because H (in our case, CC T ) is large, 
iterative linear solvers are required. The iterative inversion of Maxwell's curl- 
curl operator is well-studied and the current methods of choice rely heavily on 



the multigrid technique [10|, [22|]. When used on discretizations of the Lapla- 
cian operator in model problems (e.g. Poisson's equation on a Cartesian grid 
on a periodic square/cubic domain), multigrid methods can reduce residuals 



by an order of magnitude per linear solver iteration [10] (for a generic linear 
system Hx = y and an approximate solution x, the residual is ||Hx — y||). 
Modifications of standard multigrid techniques try to reproduce this behavior 
for the curl-curl operator 0, H, O, 0, 0, 0] • 

Special treatment must be given to the curl-curl operator because of its 
large nullspace (in the case of Eq. (J3J), the nullspace is the set of all modes 
with nonzero divergence and k 2 = plus the three uniform field modes); 
the existence of such a nullspace generally hinders the performance of stan- 
dard multigrid preconditioners jij. Since multigrid preconditioners perform 
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admirably on Laplacian-like matrices, one approach is to augment the curl- 
curl operator with a grad-div part to look more like a vector Laplacian (since 

V x V x — VV- = —V 2 ) in such a way that the electromagnetic eigenmodes 
are unchanged 0, |23[. This technique demands that the discretized differ- 
ential operators satisfy certain vector identities of continuous space (namely, 

V • Vx =0 and V x V = 0) and that the vector Laplacian has a trivial 
nullspace (such discretizations are called compatible; see Ref. 2^]). Indeed, 



the standard second-order divergence, curl, and gradient operators on the 
Yee mesh meet this requirement. 

In vacuum, the Yee magnetic divergence operator, D, acts on b to give a 
cell-centered scalar field ip; for Yee grid cell ijk, the operation is 

„/, (r\u\ b x \i + ijk — b x \ijk Oy|y+ifc — b y \ijk b z \ijj, + i — b z \ijk , , 

Wijk = [Ub) ijk = 1 1 . (8J 

Ax Ay Az 

The Yee layout ensures that DC = (the discrete equivalent of the vector 
identity, V • Vx = 0); also, the transpose is necessarily zero, which is a 
discrete version of the identity, V x V = 0. In fact, the discrete Yee magnetic 
gradient operator is — D T , which, for the component, b x ujk, takes the form 

(-D ip) x]ijk - — . (9J 

In the absence of any boundaries, the new eigenproblem to be solved (which 
approximates the vector Laplacian) is 

(CC T + D T D) b' = k' 2 h' (10) 

Because the range of C is in the nullspace of D, an eigenmode of Eq. ([3]) with 
nonzero eigenvalue is also an eigenmode of Eq. ffTOl) with the same eigenvalue 
(i.e. kf = kf and \ = for all kf > 0). 

The form of the matrix in Eq. [TD] now facilitates efficient inversion by 
multigrid methods; however, in exchange, modes with nonzero divergence 
(which originally had eigenvalues equal to zero in Eq. |3]) now pollute the 
spectrum. 

4-3. Projection 

Modes with nonzero divergence introduced by the grad-div operator can 
be removed at any time by the following projection operator 

P = I-D T (DD T )- 1 D (11) 
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where the matrix DD T is a scalar Laplacian and thus is easily inverted by 
multigrid techniques. To see that it is indeed a projection operator, note 
that P 2 = P. 

The projection works because an arbitrary magnetic vector b can be 
expressed as the discrete Helmholtz decomposition 

b = Ce + D T ^ (12) 

where e is an arbitrary grid electric field (associated with cell edges) and 
if) is an arbitrary cell-centered scalar field. The two terms are orthogonal 
under the Euclidean inner product since the Yee differential operators satisfy 
the vector identities DC = C T D T = 0. Furthermore, the two terms span 
the entire discrete magnetic vector space since it is known that the vector 
Laplacian in Eq. [10] has a trivial nullspace (in the absence of constant field 
vectors; e.g., for Dirichlet boundary conditions) (24j . 
Applying the projection operator to b in Eq. IT21 gives 

Pb = Ce. (13) 

We can now eigensolve Eq. [3]for the modes of interest; the following trans- 
formed eigensystem places eigenvalues of the low-frequency electromagnetic 
modes at the top of the transformed spectrum while zeroing the eigenvalues 
of the unwanted modes (with nonzero divergence): 

(I - D T (DD T )- 1 D)(CC T + D T D - al^b = -r^—b. (14) 

A single eigensolver iteration requires two matrix inversions — the shifted vec- 
tor Laplacian and the scalar Laplacian within P. 

5. Eigensolving the Dey— Mittra wave equation 

The Dey-Mittra algorithm amounts to a slight modification of the Yee 
vacuum curl-curl matrix. Therefore, we hope to find a slight modification of 
the vacuum Yee grad-div operator that complements the modified curl-curl 
and enables fast multigrid matrix inversions. The first part of this section 
constructs such a grad-div operator. Next, some simple matrix conditioning 
is performed to help the inversions and a projection operator is constructed 
similar to Eq. [TTJ 
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5.1. A grad-div for Dey-Mittra 

In this section, we construct a grad-div operator for the Dey-Mittra algo- 
rithm based on physical intuition. Later (Sec. EJ), we show that this operator 
leads to the nearly ideal performance of standard multigrid preconditioners 
on the resulting vector Laplacian. The choice of grad-div is suggested by a 
discrete form of Gauss' Law (in integral form) for the magnetic field. 

Equation (jSJ) can be rewritten as 

Ipijk {^x\i+ljkbx\i+ljk ^x\ijkbx\ijk dy\ij+lk^y\ij+lk 

Vijk 

@"y\ij kby\ij k Q-z|ijA;+l^z|iifc+l ^z\ijkbz\ijk) • (^^) 

where, in vacuum, is the volume of Yee grid cell ijk (in vacuum, = 
AxAyAz). The above can be interpreted as a discrete representation of 
Gauss' Law for the divergence of B: J V • Bdv = § B • da; ipijk represents 
the value of the magnetic divergence averaged over the volume of grid cell 
ijk and is calculated from the total flux of B out of that grid cell. 

Equation f|T5|) is naturally extended to Dey-Mittra boundaries where, for 
example, a x ujk < AyAz and < AxAyAz (see Fig. [3] for an example of a 
cut cell). If grid cell ijk is cut, is the volume of the cell that is outside the 
conductor. To find the volume-averaged divergence, ipijk, the outward flux 
of B must be calculated on the bounding surfaces of v^. Fortunately, the 
conducting boundary condition on B, Eq. (|2j), forces the normal component 
to zero, so that the boundary surface is not included in the flux calculation, 
leaving Eq. (TT5T) a physically meaningful expression. In matrix form, the 
Dey-Mittra divergence operator is 

D DM = V-'DA (16) 

where V is a diagonal matrix of cell volume fractions (V(ijk)(ijk) = 
v ijk / AxAyAz). 

With Dey-Mittra boundaries, the new eigenproblem for the discrete vec- 
tor Laplacian is then 

(A~ 1 CLC T + D T V _1 DA) b' = k' 2 b'. (17) 

Conveniently, the modified Dey-Mittra divergence operator has also guaran- 
teed that eigenmodes of Eq. (jSJ) with k 2 ^ (electromagnetic eigenmodes) 
are also eigenmodes of Eq. (|17|) since DAb = when b is an eigenmode 
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Figure 3: An example of cut cell ijk. The shaded region is conductor. We calculate the 
discrete divergence of B by summing the magnetic fluxes out of the unshaded volume 
and dividing by the unshaded volume. Since the normal magnetic field is zero on the 
conducting surface, that outward flux is ignored. 

of Eq. (jSJ); in fact, the inverse volume fraction matrix is irrelevant for this 
purpose — setting it equal to the identity would also ensure that electromag- 
netic eigenmodes are unchanged from Eq. (jSJ) to Eq. ( flTl) . However, as we will 
see in the numerical results section, the accurate calculation of the volume 
fractions is essential for good performance of multigrid preconditioners. 

5.2. Conditioning and projecting 

We have found it quite difficult to invert the left-hand side of Eq. [T7| 
however, if we multiply both sides by the area fraction matrix, A, then inver- 
sions using multigrid converge very quickly. We believe this is primarily due 
to ill-conditioning of the matrix in Eq. [T7] resulting from large elements in 
A -1 and V -1 . Applying the spectral shift and inversion to the A-transformed 
system gives 



It was also noted by one of our reviewers that the matrix in Eq. [IS] is sym- 
metric, which may further help the linear solver. 

As with the Yee scheme in vacuum, the eigenmodes with nonzero diver- 
gence must be removed (post-inversion) from the range of the vector Lapla- 
cian. Let the inner product on the discrete Yee magnetic field be 



(CLC T + AD T V -1 DA - ctA) Ab' 



1 



b'. 



(18) 



a 




(19) 
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then we can write an arbitrary discrete magnetic field vector b', as the 
Helmholtz decomposition 

b' = A -1 CLe + D T ^ (20) 

where e is a discrete Yee electric field vector and ip is some cell-centered scalar 
field (the two parts are orthogonal under the inner product of Eq. [TU]) . For 
a general vector, b', with the above decomposition, the following projection 
operator will eliminate the part derived from the gradient matrix: 

P dm = I - D T (DAD T )- 1 DA (21) 

where I is the identity matrix. In all of our numerical tests, the matrix DAD T 
was easily inverted by multigrid. Generally, we have found that if the linear 
solver can invert the vector Laplacian, it can invert the scalar Laplacian in 
Eq. [21] in a shorter time and in fewer iterations. 

The final transformed system to be diagonalized iteratively is 

(I - D T (DAD T )- 1 DA) (CLC T + AD T V _1 DA - aA) _1 Ab' = — — b' 

(22) 

The largest eigenvalues of the above matrix now correspond to k' ~ a, are 
relatively well-separated, and are purely electromagnetic (since the projection 
operator zeros the eigenvalues of modes with nonzero divergence). 

5.3. A transformation to resemble Yee 

It was observed by one of our reviewers that Eq. [22] can be written in a 
more compact form that makes a stronger connection with the Yee scheme. 
If we let 

D = V- 1 / 2 DA 1 / 2 C = A- 1 / 2 CL 1 / 2 b = A x / 2 b (23) 
then left-multiplying Eq. [22] by A 1 / 2 gives 

(l-D T (f3D T )- 1 f3) (ccF + DTD-rf) -1 ^— !— b' (24) 

which mimics Eq. [141 The transformations of Eq. [23] also compactly repre- 
sent the vector identities and Helmholtz decomposition for the Dey-Mittra 
algorithm: 

DC = C T D T = (25) 
b' = Ce + f) T ip, (26) 
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respectively. 

Unfortunately, numerical tests show that the elegant symmetric form of 
Eq. |23]ruins convergence, supporting the argument that the inflated elements 
of the A -1 / 2 operator within C are to blame (rather than overall operator 
symmetry). The form in Eq. [22] was used to obtain the following numerical 
results. 



6. Numerical results 

Our eigensolver makes extensive use of the trilinos linear algebra frame- 



work [25| to solve Eq. [22] Specifically, we use the block Krylov-Schur routine 



from the Anasazi package for the outer eigensolver iterations [26|, |27j, the 
GMRES linear solver from the AztecOO package for matrix inversions 28], 
and the algebraic multigrid (AMG) tool from the ML package as a precondi- 
tioner for the GMRES solver HQ- Within the AMG method, we use the 



polynomial-based multilevel smoother (of order 1) as described in [30| which 
exhibits good parallel performance (e.g. as compared with popular Gauss- 
Seidel smoothers) since only matrix- vector multiplications are required. The 
multigrid preconditioner used the V-cycle; therefore, the smoother was ap- 
plied twice at each level per iteration (once on the way to coarser grids, and 
once on the way back to the original fine grid) . The coarsest level was treated 
the same as any other multigrid level. The vector and scalar Laplacians to be 
inverted in Eq. [22] were formed explicitly, then passed to the ML algorithm. 

Cubic grid cells were used throughout the following tests (Ax = Ay = 
Az). 

Most simulations were performed on Hopper, the Cray XE6 supercom- 
puter at the National Energy Research Scientific Computing Center. 

6.1. Performance: spherical cavity 

In Table HJ we compare the performance of the linear solver (the bottle- 
neck) on a model problem (cubic domain with perfectly-conducting bound- 
aries) with a simple problem requiring the Dey-Mittra algorithm (the spher- 
ical cavity). No cut faces were neglected in the sphere simulation; the small- 
est area fraction encountered was 6 x 10~ 4 . The cubic domain had arbitrary 
side-length L, and the spherical cavity had radius 0.49L. Eigensolves were 
performed for the lowest three modes; each eigensolve took about 20 outer 
iterations to complete (20 vector/scalar Laplacian inversions). The figures 
in Table [1] are averages over these 20 inversions. 
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The multigrid complexity is defined as 

complexity = Z " i=1 -. - \ l> 27 
nnz(Ai) 

where Aj is the linear system matrix on multigrid level i (with i — 1 rep- 
resenting the finest level) and the nnz() operator produces the number of 
nonzero elements in the operand matrix. 



Table 1: Linear solver benchmarks for the inversion of the vector Laplacian without (cube) 
and with (sphere) Dey-Mittra boundaries. One polynomial (order 1) smoother sweep per 
multigrid level. Component count is the number of magnetic field components. Inversions 
were performed to 10 -6 accuracy. For every simulation, processor domains were 16x16x16 
cells. 





Cube 


Sphere 


Component count 


12,285 


98,301 


786,429 


7,815 


55,398 


414,489 


Avg. iteration count 


8.8 


9.3 


9.6 


10.0 


10.1 


10.5 


Convergence rate 


0.21 


0.22 


0.24 


0.25 


0.25 


0.27 


Multigrid levels 


3 


4 


4 


3 


4 


4 


Multigrid complexity 


1.5 


1.6 


1.6 


1.6 


1.7 


1.8 


Domain decomposition 


lxlxl 


2x2x2 


4x4x4 


lxlxl 


2x2x2 


4x4x4 



Iteration counts are strikingly similar between the model problem and 
the spherical cavity. However, if the inverse volume matrix is left out of 
Eq. [22l vector Laplacian inversions do not converge in a reasonable number 
of iterations (for a GMRES basis size of 40, more than 10 restarts were 
required — thus, more than 400 total iterations). To examine the importance 
of the inverse volume matrix on the linear solver performance, we perturbed 
our accurate cell volume calculations (for cut cells only) in the following way 

v ijk = 10^v ljk (28) 

where v^k is the accurate volume, fijk is a random factor between —1 and 1, 
and e is an "order of magnitude" parameter (e.g. for e = 1, the volume can 
be wrong by up to an order of magnitude). 

We also tested errors that are likely to result from a simple subsampling 
volume calculation routine. In this case, the model was 

v^k = v ijk + fijk^v (29) 

where v^k is the accurate volume, f^k is a random factor between —1 and 1, 
and Av is the perturbation (or subsample) volume. 
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Table [2] shows the effect of these errors on linear solver convergence for 
different values of e and Av/v Y&c where t> vac = AxAyAz, and indicates the 
importance of a good volume estimate (assuming accurate area fraction cal- 
culations). For example, according to Tabled a subsampling routine with 
10 samples per dimension of a grid cell (1000 cubic subvolumes) could lead to 
very poor convergence. Our volume calculation method breaks cut-cell vol- 
umes into tetrahedral subvolumes that conform to the embedded surface (the 
volumes of which are easy to calculate individually); therefore, our method 
converges to the exact cut-cell volume for smooth boundaries. 



Table 2: Effect of cut-cell volume calculation errors on linear solver convergence for the 
spherical cavity problem. Each e and Aw was simulated three times; the average iteration 
count from each was then averaged over these three simulations. Simulation domain was 
32x32x32. Inversions were performed to 10 -6 accuracy. 



Error from Eq. 


M 


e 


0.1 


0.3 


0.5 


1 


2 


3 


Avg. iteration count 


10 


11 


13 


24 


74 


220 


Error from Eq. 




Av/v vac 


10~ 5 


10~ 4 


10~ 3 


Avg. iteration count 


10 


50 


91 



6.2. Dey-Mittra frequency and field convergence 

This section compares frequencies and surface fields in the spherical cavity 
calculated by the Dey-Mittra algorithm with their analytic counterparts for 
varying ct min /a vac (where a vac = Ax Ay = AyAz = AxAz because cells are 
cubic). For these simulations, parity symmetry was invoked to reduce the 
simulation domain to the positive x, y, and z octant (sphere centered at the 
origin). Boundaries at x — 0, y — 0, and z = (the symmetry planes) were 
set as perfectly-conducting. The lowest eigenmode was analyzed which, for 
the above symmetry, corresponded to the TM 32 spherical cavity mode (3l| . 

Relative frequency errors were calculated by 

|w-w | 

£freq = (30) 

where u is the calculated angular frequency and Uq is the analytic angular 
frequency. All plots in this section use as the abscissa the resolution relative 
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to the vacuum wavelength of the mode, where the vacuum wavelength is 
defined as A vac = 2ttc/u . Figure @] shows the second-order convergence of 
the pure Dey-Mittra algorithm and the first-order error effect introduced by 
neglecting cut faces. One should also note the difference in relative error at 
low resolutions, which can be significant. 




10" 



10 1 



10 2 
A vac /Ax 



Figure 4: Color online. Convergence of TM032 resonant frequency in the spherical cavity 
as a function of resolution and a m i n /a vac . The unaltered Dey-Mittra algorithm is clearly 
second-order, while neglecting faces results in eventual first-order convergence (as similarly 
reported in 



Electric fields near the surface of the spherical cavity were calculated 



and compared with analytic solutions [3l||. Since interpolation of fields to 



the metal boundary is equivocal and nontrivial, we analyzed "surface" fields 
on a shell a radial distance 3Ax away from the boundary where standard 
trilinear interpolations are appropriate. Because surface fields were analyzed 
a fixed number of grid cell widths from the boundary, the physical distance 
from the boundary decreased as resolution increased. Before surface fields 
were compared, analytic and simulated fields were scaled to have the same 
£ 2 -norm where the norm was calculated by sampling each field over the same 
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collection of points filling the cavity volume. 

The relative £2 errors of the computed eigenmodes are, 



Afield 





e(x 


0-E(x,)| 2 


E* |E(x.) 


2 



(31) 



where the Xj are test points on a shell, E(x) is the analytic eigenmode eval- 
uated at x, and e(x) is the computed eigenmode interpolated to the point 
x (using trilinear interpolation). Figure |5] shows the results for the TM032 
spherical cavity mode. Surface fields ultimately do not converge when faces 
are neglected, yet converge with nearly second-order error for the pure Dey- 
Mittra algorithm. 




10 2 

A vac /Ax 



Figure 5: Color online. Convergence of TM032 surface fields in the spherical cavity as a 
function of resolution and a m i n /a vac . 



6.3. Performance: TESLA cavity 

Our eigensolver's performance (i.e. the linear solver performance) was also 
tested on a modern accelerator cavity problem — the TESLA superconducting 
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cavity [32] (see Fig. E]). One quarter of the nine-cell cavity was simulated 
where the x = and y = boundary planes were perfect magnetic conductor 
(the z-direction is the long direction); therefore, the TMoio-like accelerating 
modes were found at the lowest frequencies (~ 1.3 GHz). Three different 
resolutions were tested on the TESLA cavity shown in Fig. [6j additionally, a 
cryomodule was simulated, which consists of 8 TESLA cavities strung end- 
to-end in the z-direction. 

Simulation results are summarized in Table |3j the first 3 columns refer 
to the single TESLA cavity simulations and the last column the cryomodule. 
For the TESLA cavity simulations, the maximum eigensolver Krylov basis 
size (before a restart occurs) was 50, and the lowest 9 eigenmodes were found. 
Only 32 outer eigensolver iterations were required (no restarts necessary) 
at each resolution. For the cryomodule simulation, to save resources on 
Hopper (supercomputer at NERSC), we restricted the eigensolver to find only 
the lowest 3 eigenmodes and disallowed eigensolver restarts; 30 eigensolver 
iterations were performed. 

In Table [31 the estimated relative frequency error of the accelerating mode 
is given by £f rcq = (fAx/c) 2 where / = 1.3 GHz. 




Figure 6: Simulated quarter section of the 9-cell TESLA superconducting accelerator 
cavity. Beam tubes terminate at conducting walls. The length of the cavity is 1.38 meters. 



7. Conclusion 

We derived a matrix transformation for the Dey-M ittra curl-curl operator 
that allows for efficient inversion by multigrid methods. In fact, the perfor- 
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Table 3: Performance of linear solver during an eigensolve of the TESLA cavity. One 
polynomial (order 1) smoother sweep per level per traverse (same as spherical cavity 
simulations). Inversions performed to 10 -6 accuracy. Numbers in the last column apply 
to a cryomodule simulation — eight TESLA cavities end-to-end. 



Comp. count (xlO 6 ) 


0.8 


6.4 


50.2 


520 (cryo) 


Grid cell Ax (mm) 


2.9 


1.4 


0.72 


0.66 


Estimated £f req 


2 x 10~ 4 


4 x 10~ 5 


1 x 10~ 5 


8 x 10~ 6 


Avg. iteration count 


11.8 


11.9 


12.6 


13.9 


Convergence rate 


0.31 


0.31 


0.33 


0.37 


Multigrid complexity 


1.6 


1.6 


1.6 


1.6 


Multigrid levels 


5 


5 


5 


5 


Domain decomposition 


1x1x12 


2x2x24 


4x4x48 


4x4x418 



mance of multigrid applied to our custom Dey-Mittra vector Laplacian nearly 
equals the performance of multigrid on the model problem (grid-aligned cubic 
domain). As a result, we developed an efficient shift-and-invert eigensolver 
for Maxwell's equations in resonant cavities. Eigensolver performance was 
demonstrated on the simple spherical cavity and the TESLA superconducting 
cavity. This eigensolver is open-sourced at https : / / github . com/bauerca/ 
maxwell. 

The effects on accuracy of neglecting small cut faces in the Dey-Mittra 
algorithm were also investigated. An analysis of Dey-Mittra surface fields 
showed the stagnation of convergence (when faces are neglected) for fields a 
fixed number of grid cells away from conducting boundaries; in contrast, if 
all faces are kept (which is the case for our new eigensolver), surface field 
convergence is nearly second-order. 
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